Abstract

The dung beetle, Euoniticellus intermedius (Reiche) (Coleoptera: Scarabaeidae), is an important
ecological and agricultural agent. Their main activity, the burying of dung, improves the quality of
the soil and reduces pests that could cause illness in animals. E. intermedius are therefore 
important for agriculture and for good maintenance of the environment, and are regarded as 
effective biological control agents for parasites of the gastrointestinal tract in livestock. The 
ability of E. intermedius to co-exist comfortably with many microorganisms, some of which are 
important human pathogens, stimulated our interest in its host defense strategies. The aim of this 
study was to investigate the Toll signaling pathway, which is strongly activated by fungi. Gene 
expression associated with fungal infection was analyzed by using 2-D gel electrophoresis and 
mass spectrometry. Furthermore, the partial adult transcriptome was investigated for the presence
of known immune response genes by using high-throughput sequencing and bioinformatics. The 
results presented here suggest that E. intermedius responds to fungal challenge via the Toll 
signaling pathway.

Introduction

The dung beetle, Euoniticellus intermedius 
(Reiche) (Coleoptera: Scarabaeidae), belongs 
to the most diverse order of insects, with more 
described species than any other in the animal 
and plant kingdoms (Farrell 1998). The two 
main branches of Class Insecta are the hemi- 
metabolous insects, such as the grasshopper, 
which have incomplete metamorphosis, and 
the holometabolous insects, such as the bee- 
tles, which have complete metamorphosis. 
These branches split about 416 m.y.a. 
(Labandeira and Phillips 1996; Hoffmann 
2004; Erezyilmaz 2006). 

Drosophila melanogaster Meigen (Diptera: 
Drosophilidae) is another holometabolous in- 
sect and is regarded as the most genetically- 
tractable and widely-studied laboratory model 
of the holometabolous insects, but there is ev- 
idence that it is highly specialized and does 
not fully represent invertebrate evolutionary 
characteristics. For example, the beetle
Tribolium castaneum (Herbst) (Coleoptera:
Tenebrionidae) shares with vertebrates
ancestral genes that are not present in D.
melanogaster. In particular, components of
some signaling pathways that are conserved in
coleopterans and vertebrates are not conserved
in D. melanogaster (Van der Zee et al. 2008).
Moreover, T. castaneum embryogenesis has 
retained the ancestral short-germ band mode, 
which resembles vertebrates (Tautz and 
Sommer 1995; Handel et al. 2000; Liu and 
Kaufman 2005). A recent study on protein 
evolution suggests that coleopterans have 
lower rates of divergence when compared to 
dipterans where a highly accelerated protein 
evolution is evident (Savard et al. 2006). Col- 
eopterans are therefore likely to be more 
suitable for comparative studies against
vertebrates than the dipterans.

Due to their important agricultural benefits, E.
intermedius are often introduced to control 
ecological damage. Some of the benefits in- 
clude nutrient recycling, improvements to soil 
tilth, and pest control (Bertone et al. 2006). 
Tunneling beetles, such as E. intermedius, are 
beneficial to pasture health, as they enhance 
soil conditions by increasing percolation. Fur- 
thermore, dung beetles reduce the number of 
parasites acquired by cattle (Fincher 1975). 
They also reduce the population of pestiferous 
flies, such as the African buffalo fly (Haemat- 
obia thirouxi potans (Bezzi)), the Australian 
buffalo fly (Haematobia irritans exigua de 
Meijere), the bush fly (Musca vetustissima
Walker), the face fly (Musca autumnalis De
Geer), and the horn fly (Haematobia irritans
irritans (L.)) (Lastro 2006). Consequently, the
introduction of E. intermedius into new
habitats has a high impact on the environment. E.
intermedius was introduced in Texas, USA, to 
control dung accumulation, and the beetles 
spread quickly to Mexico, where they had a 
positive impact on soil fertility and produc- 
tivity. E. intermedius also help control 
nematodes that are potential cattle parasites 
(Montes and Halffter 1998; Anduaga 2004). 

Adult E. intermedius feed on the microbe-rich 
particulate portion of the dung and do not ac- 
tually consume the dung fibers (Holter and 
Scholtz 2007). The larvae, on the other hand, 
consume most of the dung fibers in the brood 
balls. Development from egg to adult takes 5 
to 6 weeks, and adults have a lifespan of about 
2 months. The evolutionary history and
lifestyle of dung beetles point to an interesting
immune system that could, in some respects, be
more comparable to vertebrate innate immunity
than the immune system of D. melanogaster is.

The general trend in insect immune signaling 
is that recognition of pathogens depends upon 
the existence of molecular patterns, called
pathogen-associated molecular patterns, on the
cell wall of pathogens and the pattern
recognition receptors in the immune system of the
target organism. Lysine-containing
peptidoglycans found in Gram-positive bacteria
and β-1,3-glucans in fungi stimulate the Toll
signaling pathway, while diaminopimelic
acid-containing peptidoglycans found in
Gram-negative bacteria and some Gram-positive
bacilli stimulate the immune deficiency pathway
(Hoffmann 2004; Zou et al. 2007; Altincicek
et al. 2008; Valanne et al. 2011). In D. mela- 
nogaster, it has been shown that virulence 
factors and pathogen associated molecular 
patterns in fungi and bacteria activate the Toll 
pathway via two parallel protease cascades 
that culminate in the cleavage of pro-spaetzle 
to produce the active physiological ligand of 
Toll. The virulence factors, such as cell wall 
components, in live organisms activate the 
Toll pathway via pattern recognition receptors 
and a serine protease cascade, which eventually
cleaves the inactive pro-spaetzle to
release an active C-terminal fragment (C-106),
which then binds to Toll. The parallel path- 
way is induced by pathogen-secreted 
proteases and proceeds via a cascade involv- 
ing Persephone (Psh) and activation of Toll by 
activated spaetzle (Imler et al. 2004; Chamy et 
al. 2008; Kellenberger et al. 2011; Valanne et 
al. 2011). Intracellular signals transduced by 
the binding of spaetzle and Toll are relayed by 
a phosphorylation cascade that culminates in 
the cytoplasmic release and nuclear
localization of NF-κB-like transcription factors. The
Toll pathway leads to the release of the
Dorsal-related immunity factor (Dif) transcription
factor, which activates the expression of various
effector genes, primarily those encoding
antimicrobial peptides.

In recent years, ground-breaking work in col- 
eopteran immunity has been accumulating, 
and the nature of the signaling network is 
emerging. Studies using the larger beetles
Tenebrio molitor and Holotrichia diomphalia
show that beetles respond to β-1,3-glucans,
to lysine-containing peptidoglycan, and to
polymeric diaminopimelic acid-containing
peptidoglycans by using the Toll signaling
pathway. Both types of pathogen-associated
molecular patterns form complexes with
recognition proteins: Gram-negative binding
protein 3 (GNBP3) in the case of fungi and
peptidoglycan recognition protein in the case of
Gram-negative bacteria. These complexes
activate the serine protease cascade via an apical
protease known as the modular serine protease,
followed by two types of CLIP-domain serine
proteases at the penultimate and terminal
positions before pro-spaetzle (Park et al. 2010;
Kellenberger et al. 2011; Ntwasa et al. 2012).
The studies on beetles revealed important 
pieces of evidence indicating that beetle im- 
munity might differ from that of flies. For 
example, studies conducted on the red flour 
beetle revealed that the peptidoglycan
recognition proteins, used by invertebrates to sense
pathogens, diversified in beetles before the
radiation of holometabolous insects, while in
D. melanogaster there has been sustained
diversification through numerous duplications.
Furthermore, many immune-related genes,
including the CLIP-domain serine proteases
and their inhibitors, Toll-related proteins, and
antimicrobial peptides, diversified extensively
through evolution, most likely because of the
diverse habitats of the species (Zou et al. 
2007). 

There have been no significant molecular 
studies conducted on E. intermedius thus far, 
except for those involving the relationship be- 
tween prophenoloxidase and immunity 
(Pomfret and Knell 2006). The aim of this 
study was to assess the immune defense path- 
ways that are activated when E. intermedius is 
infected by fungi. It is part of a broader inves- 
tigation of host defense mechanisms adopted 
by the dung beetle. The life cycle of E.
intermedius is presented to highlight that it is
dominated by microbes, some of which are human
pathogens. The larval and adult stages are par- 
ticularly dependent on these microbes for 
nutrition. Proteomic and bioinformatics analy- 
sis of the adult transcriptome of the dung 
beetle indicates that the Toll signaling path- 
way may be conserved and probably activated 
by fungi. 

Materials and Methods 

Rearing beetles and microscopy 

Beetles were bred in 160 x 130 x 130 mm 
plastic containers, which were half full with 
soil. Cow dung was placed on the soil, and 3 
breeding pairs of beetles (3 males and 3 fe- 
males) were added to the containers. Every 3 
to 4 days, frozen and then thawed cow dung 
was placed in the container. Once a week, the 
soil was sieved, and brood balls were removed. 
If the breeding pairs had survived, they were
moved to a new container with fresh soil and
dung, and the brood balls were placed in a
large (400 x 300 x 200 mm) plastic container 
covered with compacted soil. A wet sponge 
was then placed on the soil to keep it moist. 
Once beetles began to emerge, small, plastic 
dishes baited with dung were used as traps to 
capture them within the bins. These beetles 
were used as new breeding pairs or in experi- 
mental work. 

Treatment of beetles 

Adult male and female beetles were infected 
with the entomopathogenic fungus Beauveria 
bassiana (strain 80.2), and samples of haemo- 
lymph were collected before and after 
infection. Ten beetles were placed on a 90 mm 
plate containing B. bassiana (i.e., 1 beetle per 
6.36 cm²). The plate was then placed on a
shaker for 5 minutes, after which the beetles 
were transferred to a small jar containing
sterile soil and dung and then left overnight to
allow an immune response to occur.
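
The stated exposure density of 1 beetle per 6.36 cm² follows from the surface area of a circular 90 mm plate shared among ten beetles. The short sketch below reproduces that arithmetic; the function name is hypothetical and the plate is assumed to be a flat circle whose full diameter is usable.

```python
import math


def area_per_beetle_cm2(plate_diameter_mm: float, n_beetles: int) -> float:
    """Plate surface area (cm^2) available to each beetle on a circular plate."""
    radius_cm = plate_diameter_mm / 10.0 / 2.0  # mm -> cm, diameter -> radius
    plate_area_cm2 = math.pi * radius_cm ** 2
    return plate_area_cm2 / n_beetles


# Ten beetles on a 90 mm plate, as described in the text.
density = area_per_beetle_cm2(90, 10)
print(f"{density:.2f} cm^2 per beetle")
```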

Collection and preparation of haemolymph 
samples 

To collect haemolymph from adult beetles, the 
exoskeleton was pierced below the hind limb 
using a heated tungsten spike. A 10 µL
microsyringe needle was then pushed through
this hole. The plunger was slowly withdrawn,
and between 2 and 5 µL of haemolymph was
collected and pooled from 5 beetles without
discriminating between genders. The extracted
haemolymph was diluted 1:4 in extraction
buffer (0.1% trifluoroacetic acid (to obtain
pH 3), 10 µg/mL aprotinin, 1 µM leupeptin, 15
µM phenylmethylsulfonyl fluoride, and 20
µM phenylthiourea). The samples were not
centrifuged to remove hemocytes, because
both extracellular and intracellular events at 
this stage were of interest. Protein concentra- 
tion was measured by the Bradford method 
(Bradford 1976). 

2-dimensional gel electrophoresis 

Haemolymph proteins were re-suspended in 
2-D PAGE re-hydration buffer (8 M urea, 2% 
CHAPS, 50 mM DTT, 0.2% Bio-lyte, 0.5 % 
Bromophenol Blue) and applied onto 7 cm 
immobilized pH 3-10 gradient strips (Bio-Rad, 
www.bio-rad.com). Passive rehydration was
allowed to occur for 12 hours at 20 °C.
Isoelectric focusing was performed in the Protean
IEF system (Bio-Rad) according to the
following program: an initial low voltage (250 V)
20-minute linear ramping step, followed by a 
high voltage (4000 V) 2-hour linear ramping 
step. The final step was a rapid ramping step 
for 10,000 volt hours. The strip was equili- 
brated in a new rehydration tray and covered 
with equilibration solution one (6 M urea, 2% 
SDS, 0.375 M Tris HCl pH 8.8, 20% glycerol, 
2% DTT). After shaking for 10 minutes, equi- 
libration solution one was replaced with 
equilibration solution two (6 M urea, 2% SDS,
0.375 M Tris HCl pH 8.8, 20% glycerol, 2.5%
iodoacetamide) and shaken for 10 more
minutes. The strips were then briefly placed in
SDS running buffer, placed on a 10%
polyacrylamide separating gel (Laemmli 1970),
and then overlaid with agarose. The 2-D gel
was then run for 40 minutes at 200 V and
subsequently stained with Coomassie blue.

Image Analysis 

The two gels representing samples from treat- 
ed and untreated beetles were digitized using a 
GS800 calibrated densitometer (Bio-Rad), and 
the resulting images were analyzed using the 
PDQuest 2-D Analysis Software Version 6.2 
(Bio-Rad). Spot detection was carried out with 
a sensitivity setting of 102.21, using the treat- 
ed sample gel as the master gel. 464 and 374 
spots were respectively detected in the treated 
and untreated sample gel images. The analysis 
software was used to select spots for
identification by mass spectrometry. Many spots
within the Mr range 10-100 kDa and the pI
range 4-7 were selected, based on whether they
were down- or up-regulated.

Spots were initially matched using the
automated matching function of the program.
Extended matching was then performed using 
the classical matching function. Both these 
methods involved matching spots based on the 
position of landmark spots. Manual spot 
matching and analysis was then performed 
across the gel images. This resulted in a final 
spot count of 135 spots in the master gel im- 
age, and 114 and 84 in the treated and the 
untreated images respectively. 

Normalization was performed automatically 
by the program and was based on the total 
quantity in all valid spots option. This method 
assumed that changes in density average-out 
across the gels being analyzed.

The normalization formula used is as follows:

$$\text{Normalized spot quantity} = \frac{\text{Raw spot quantity} \times \text{Scaling factor}}{\text{Normalization factor (total quantity in all valid spots)}}$$

The scaling factor used was $10^{6}$ parts per
million.
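
As an illustration of the normalization described above, the following minimal sketch rescales raw spot quantities by the total quantity in all valid spots and a scaling factor of 10^6. It is not the PDQuest implementation; the function name and the raw values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
SCALING_FACTOR = 1_000_000  # parts per million, as stated in the text


def normalize_spots(raw_quantities, scaling_factor=SCALING_FACTOR):
    """Return normalized quantities: raw * scaling factor / total of valid spots."""
    total = sum(raw_quantities.values())  # normalization factor
    if total <= 0:
        raise ValueError("total quantity in valid spots must be positive")
    return {spot: q * scaling_factor / total for spot, q in raw_quantities.items()}


# Made-up raw quantities for three valid spots on one gel (illustration only).
raw = {"spot_A": 120.0, "spot_B": 60.0, "spot_C": 20.0}
normalized = normalize_spots(raw)
print(normalized)
```

Because every gel is rescaled to the same total, spot quantities become comparable between the treated and untreated gels, under the stated assumption that density changes average out across the gels.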

Statistical analysis 

Three biological replicates of haemolymph 
extracted from immune-challenged and 
unchallenged adult beetles were analyzed 
using 2D-PAGE and the PDQuest software. 
The means of density values from these 
biological replicates were used to construct 
the graphical representations of the optical 
density values for each spot. Two-sample t- 
tests were performed to test for statistically 
significant differences between the challenged 
and unchallenged groups for each spot at the 
95% confidence level. Two t-tests were
computed, one assuming equal group 
variances and the other assuming different 
group variances. An F-test for the null 
hypothesis that the immune-challenged and 
unchallenged populations have the same 
variance was also performed. The results of 
these tests were presented as p- and F- values. 
All statistical tests were performed using the 
Statistix 8 software package 
(www.statistix.com).
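
The three test statistics named above can be sketched with the Python standard library. This is an illustration of the textbook formulas only, not the Statistix implementation used in the study; the function names and the density values are hypothetical. Converting the statistics into p-values requires the t and F distributions, which are deliberately left out here.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample t statistic assuming equal group variances (pooled), with df."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Two-sample t statistic assuming different group variances (Welch), with df."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def f_ratio(a, b):
    """Variance ratio F = s_a^2 / s_b^2 with (na - 1, nb - 1) degrees of freedom."""
    return variance(a) / variance(b), (len(a) - 1, len(b) - 1)


# Made-up density values for one spot, three biological replicates per group.
challenged = [10.0, 12.0, 14.0]
unchallenged = [4.0, 5.0, 6.0]

t_equal, df_equal = student_t(challenged, unchallenged)
t_welch, df_welch = welch_t(challenged, unchallenged)
f_value, f_df = f_ratio(challenged, unchallenged)
print(t_equal, df_equal, t_welch, df_welch, f_value, f_df)
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ, which is why the F-test on the variances helps decide which of the two results to report.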

Matrix laser desorption-time-of-flight mass 
spectrometry 

Selected spots were manually excised from 
the gels and placed in a 5% acetic acid
solution. After tryptic in-gel digestion following
the method of Rabilloud (Rabilloud et al.
2001), the proteins were analyzed by mass
spectrometry, using either matrix laser
desorption-time-of-flight (Biflex III, Bruker
Daltonics) or NanoLC-MS/MS on a CapLC
capillary LC system (Waters, www.waters.com)
coupled to a hybrid quadrupole orthogonal
acceleration time-of-flight tandem mass
spectrometer (Q-TOF Micro, Waters). The sample
(5 µL) was first concentrated and cleaned in a
C18 PepMap precolumn cartridge (LC
Packings/Dionex, www.dionex.com) and then
separated on-line by the analytical
reversed-phase capillary column (PepMap C18, 75 µm
inner diameter, 15 cm length; LC Packings)
under a 200 nL min⁻¹ flow rate. The gradient
profile used consisted of a linear gradient from 
97% A (97.9% H2O; 2% ACN, 0.1% (v/v) 
HCOOH) to 95% B (98% ACN, 1.9% H2O, 
0.1% (v/v) HCOOH) in 45 minutes, followed 
by a linear gradient to 95% B in 3 minutes. 
The spray system (liquid junction) was used at 
3.6 kV. Mass data acquisitions were piloted 
by MassLynx 4.0 software (Waters). NanoLC- 
MS/MS data were collected by data- 
dependent scanning, that is, automated MS to 
MS/MS switching. Fragmentation was per- 
formed using argon as the collision gas, and 
with a collision energy profile optimized for 
various mass ranges of ion precursors. Four 
ion precursors were allowed to be fragmented 
at a time. Mass data collected during a 
NanoLC-MS/MS analysis were processed and 
then submitted to de novo sequencing. Peak 
lists were generated by ProteinLynx 2.1 soft- 
ware (Waters) with internal lockspray 
calibration (leucine enkephalin at 556.2771 
m/z). There was no smoothing. Fragmentation 
spectra were loaded onto the Peptide Sequenc- 
ing software (BioLynx, Waters) and the 
sequences were processed manually before 
being submitted to a BLAST search without 
any taxonomy restriction. 

Searching the Euoniticellus intermedius da- 
tabase for immune genes and other 
bioinformatics tools 

The E. intermedius database contains a partial 
transcriptome of the untreated adult beetle 
(Khanyile et al. 2008). Sequences of innate 
immune response proteins were obtained from 
the UniprotKB/Swiss-Prot database using
gene ontology identity number and species
name. Protein sequences were selected from 
the human, D. melanogaster and T. castaneum 
databases. These included 600 human, 6000 
Drosophila and 12 Tribolium sequences, and 
were deposited in a MySQL local database. 
Using TBLASTX, the local database was 
compared with the E. intermedius unigene
database at a cut-off e-value of 1e-5.
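
The e-value cut-off described above can be applied to BLAST results as in the following minimal sketch. It assumes the hits were exported in the standard 12-column tab-separated BLAST format, in which the e-value is the eleventh column; the study does not state which output format was used, and the function name, query names, and subject names below are hypothetical.

```python
def filter_hits(tabular_lines, max_evalue=1e-5):
    """Keep (query, subject, evalue) for hits at or below the e-value cut-off."""
    hits = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue <= max_evalue:
            hits.append((query, subject, evalue))
    return hits


# Two made-up hits: one strong match and one that fails the cut-off.
example = [
    "query_protein_1\tunigene_A\t45.2\t210\t100\t5\t1\t210\t30\t650\t2e-30\t120",
    "query_protein_1\tunigene_B\t30.0\t50\t35\t0\t1\t50\t1\t150\t0.5\t25",
]
kept = filter_hits(example)
print(kept)
```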

Results 

Euoniticellus intermedius life cycle in mi- 
crobe-rich environment 

The entire life cycle of E. intermedius occurs 
in cow dung (Figure 1A). Their adaptation to
a habitat that is rich in different types of mi- 
crobes, including human pathogens, suggests 
that dung beetles possess robust host-defense 
mechanisms. Extensive studies on the devel- 
opment of coleopterans have been conducted 
on T. castaneum, a short germ insect, but little 
is known about the developmental biology of 
E. intermedius. 

The embryo develops in the brood ball appar- 
ently without physical contact with the 
interior surface. It is anchored on its posterior 
end ostensibly by the sticky black material 
termed the "maternal gift," as it is placed there 
by the mother (Figure 1B). The role of the
maternal gift has not been proven, but apart 
from acting as an anchor for the embryo, it 
could serve other purposes (Byrne et al. 2013). 
The features of the developing larvae are
typical of other scarab beetles, with a dorsal
expansion (or "hump") at the middle of the 
body and caudal flattening (Edmonds and 
Halffter 1978). This seems to be a physical 
adaptation to facilitate movement of the larvae 
in the confines of the brood ball as it rotates 
within the ball to feed on dung. Feeding seems 
to start at the second larval instar, as these
larvae look white and free of dung before this
stage. At later stages, the gut system
resembles a "bag of dung."

Euoniticellus intermedius immune response 
to fungal challenge 

2-D gel electrophoresis of haemolymph from 
adult beetles before and after infection with 
the entomopathogenic B. bassiana revealed 
differential expression of certain genes in re- 
sponse to fungal infection (Figure 2). 
Differentially expressed genes were deter- 
mined by the PDQuest software in three 
separate experiments. 

Initially, matrix laser desorption-time-of-flight 
was used to identify samples after in-gel 
tryptic digestion. The mass list obtained at this 
stage was used to search the United States 
National Center for Biotechnology 
Information (NCBI) database
(www.ncbi.nlm.nih.gov) without any
taxonomy restrictions using the Mascot search 
engine. None of the proteins could be 
identified by this method. Nano LC-MS/MS 
analysis was then performed in order to obtain 
sequence data. Short sequence fragments were 
then determined by de novo sequencing. 
BLAST search was performed on the NCBI 
BLASTP 2.2.14 on all non-redundant 
GenBank CDS containing, at this time, 
3695564 entries. For many of the positively
identified proteins, the short peptide
fragments obtained for a given protein were
used as input data for the MS-BLAST
search engine
(http://dove.embl-heidelberg.de/Blast2/msblast.html).

Proteins identified by MS-BLAST were then 
subjected to manual comparison with the E. 
intermedius database 
(http://flylab.wits.ac.za/EI/est2uni/blast.php)
using TBLASTN (Table 1). In some cases, 
alternative peptides matched sequences in the
same contig or singleton. For example, spots
2203 and 3304, which were up-regulated upon 
fungal infection, were predicted as serine pro- 
teases from the short amino acid sequences 
produced by de novo sequencing using Nano 
LC-MS/MS. Alternative peptides produced
from these spots could be assigned to different 
parts of the same contig. Sequences of two 
peptide fragments associated with spot 2203 
(with molecular weight of approximately 30 
kDa) matched parts of the predicted 
CL8Contig1 protein sequence. Both of these
fragments had 100% identity with peptide
sequences in the CL8Contig1-encoded
polypeptide, which is predicted to be a
chymotrypsin-type serine protease like Psh or
sphinx 1/2. The peptide fragments obtained
from spot 3304, at approximately 50 kDa,
facilitated the identification of two contigs
(CL20Contig1 and CL23Contig1) in the E.
intermedius database using TBLASTN. These 
contigs are also predicted to encode a serine 
protease like Psh or sphinx. Correlation of 
these spots with Psh was consistent with re- 
sponse to fungal challenge as observed in D. 
melanogaster (Imler et al. 2004). Based on 
predicted molecular weights of D.
melanogaster sphinx 1/2 (25 kDa) and Psh (43-50
kDa), sphinx could be assigned to spot 2203
and Psh to spot 3304, but a definite
identification will require full-length sequencing and
more biochemical analysis.
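
The tentative assignment above rests on comparing the apparent molecular weight of each spot with the predicted weights of the two candidate proteins. A minimal sketch of that reasoning follows; it uses only the values quoted in the text, the function names are hypothetical, and a nearest-weight rule is an assumption made for illustration rather than the procedure used in the study.

```python
# Predicted molecular-weight ranges (kDa) quoted in the text.
CANDIDATES = {"sphinx 1/2": (25.0, 25.0), "Psh": (43.0, 50.0)}

# Approximate spot molecular weights (kDa) quoted in the text.
SPOTS = {"2203": 30.0, "3304": 50.0}


def distance_to_range(value, low, high):
    """Distance from value to the closed interval [low, high]; 0 if inside."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0


def nearest_candidate(spot_mw, candidates=CANDIDATES):
    """Return the candidate whose predicted weight range is closest to spot_mw."""
    return min(candidates, key=lambda name: distance_to_range(spot_mw, *candidates[name]))


assignments = {spot: nearest_candidate(mw) for spot, mw in SPOTS.items()}
print(assignments)
```

Apparent weights on a gel can shift with glycosylation or processing, which is one reason such an assignment remains provisional until full-length sequence is available.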

Bioinformatics analysis of immune genes in 
the Euoniticellus intermedius transcriptome 
database 

Comparison of the E. intermedius
transcriptome database with databases of
holometabolous (A. mellifera, B. mori, D.
melanogaster, and T. castaneum) and
hemimetabolous (A. pisum) insects showed
that E. intermedius was more comparable to T. 
castaneum than it is to other insect species 
(Table 2). Approximately 37% of the E.
intermedius transcripts matched orthologues in
the D. melanogaster database, indicating that 
the D. melanogaster database may be inade- 
quate for functional annotation of the dung 
beetle genome. This apparent divergence 
could be due to evolutionary factors, as the 
primary radiation of coleopterans and dipter- 
ans occurred approximately 284 m.y.a., and 
dipterans have been shown to exhibit an ac- 
celerated rate of protein evolution (Savard et 
al. 2006). 

Functional annotation against the D. melano- 
gaster database revealed proteins involved in 
a range of biological processes and molecular 
functions (Figure 3). Since this ontological 
analysis provided general information about 
response to stress and did not specify im- 
mune-related genes, targeted analysis was 
conducted using more databases, as described 
in materials and methods. Consequently, more 
immune genes were predicted through com- 
parison with genes in the human, T. 
castaneum, and D. melanogaster databases 
(Table 3). 

Using this approach, several molecules asso- 
ciated with the Toll signaling pathway were 
predicted. CL8Contig1, CL20Contig1,
CL18Contig1, and CL23Contig1 were
predicted as either Psh or sphinx 1/2. Others were
predicted to be Toll-related genes
(CL426Contig1, CL673Contig1) and spaetzle
(CL283Contig1). Manual searches using
Drosophila or Tribolium sequences revealed
another putative spaetzle-encoding contig
(C147contig1), a singleton
(007956_1645_0789_c_s) that is predicted to
encode β-1,3-glucan-binding protein (GNBP3),
and GNBP1 (CL652Contig1), which are pattern
recognition receptors that recognize fungi and
bacteria. Since the E. intermedius database 
was created from transcripts obtained from 
uninfected adult beetles and was thus not
enriched for immune-related genes, the detection
of these Toll pathway genes suggests that they 
were expressed constitutively in the adult bee- 
tle. 

Identity of the predicted serine proteases 

Several proteases were identified by LC- 
MS/MS analysis combined with sequence 
similarity searches using the MASCOT search 
engine as a first layer search on public data- 
bases without taxonomy restrictions. 
Furthermore, the sequences were manually 
compared to the E. intermedius database using 
TBLASTN. Using this approach, sequences of 
peptide fragments obtained from spots 2203 
(molecular weight ~30 kDa) and 3304
(molecular weight ~50 kDa) matched proteins
encoded by CL8Contig1, CL18Contig1, and
CL20Contig1. These spots represented
proteins that were up-regulated by fungal
infection, and the matching transcripts
encoded proteins that are predicted to be trypsin-like
serine proteases. During immune response, 
serine proteases (SPs) and the non-catalytic 
serine protease homologues (SPHs), usually 
with N-terminal CLIP-domains, are known to 
mediate extracellular signaling activated by 
bacterial and fungal pathogens. SPHs are non- 
catalytic serine proteases where the catalytic 
triad has undergone mutations. Their function
is poorly understood, but in some instances 
they have been reported to act as cofactors to 
the catalytic proteases (Gupta et al. 2005). In 
H. diomphalia, the prophenoloxidase (proPO) 
activating factor (PPAF-II) acts as a cofactor 
for PPAF-I in a limited proteolysis reaction to 
activate proPO (Kim et al. 2002). 

An alignment of protein sequences encoded 
by these contigs with Psh, Holotrichia PPAF-II,
and other clip-domain SPs and SPHs showed 
that the catalytic triad was conserved in all 
three E. intermedius predicted proteins, indi- 
cating that they may be SPs (Figure 4). The 
CLIP-domain is found in serine proteases
such as Grass, Psh, and the SPH PPAF-II, but 
not in SPHs such as sphinx and spheroid (Kel- 
lenberger et al. 2011). The CLIP-domain SPs 
are known to be involved in the final steps of 
the proteolytic cascades. This study relied on 
the catalytic domains for analysis since the 
predicted E. intermedius proteins under inves- 
tigation here did not have the full N-terminal 
sequence. Phylogenetic analysis conducted 
using the catalytic domain sequences showed 
potential relatedness between the E. interme- 
dius proteins and those of other insects 
(Figure 4). In the neighbor-joining tree,
Drosophila Psh and the Holotrichia PPAF-II were
grouped together with CL23Contig1 and
possibly CL20Contig1, while sphinx 1/2 may be
related to CL8Contig1. Interestingly,
Tribolium PPAF-II is predicted to be a Psh-like
cofactor for putative proPO-activating
proteinases called TcSP7, SP8, or SP10 (Zou et al.
2007). This may be evidence for the
involvement of Psh-like proteases in the proPO
pathway. It is notable that the catalytic triad
was conserved in CL8Contig1 and mutated in
sphinx, suggesting that CL8Contig1
belonged to the class of SPs.
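
The distinction drawn above between catalytic SPs and non-catalytic SPHs depends on whether the His-Asp-Ser catalytic triad is intact at the aligned positions. The sketch below illustrates that check on toy sequences; the function name, the sequences, and the column indices are hypothetical and do not correspond to the alignment in Figure 4.

```python
CATALYTIC_TRIAD = ("H", "D", "S")  # His, Asp, Ser of chymotrypsin-like proteases


def classify_protease(aligned_seq, triad_columns):
    """Return 'SP' if the triad residues are intact at the given alignment columns,
    otherwise 'SPH' (serine protease homologue with a mutated triad)."""
    residues = tuple(aligned_seq[i].upper() for i in triad_columns)
    return "SP" if residues == CATALYTIC_TRIAD else "SPH"


# Toy aligned sequences (not real data); triad columns are 0-based indices.
TRIAD_COLUMNS = (2, 5, 8)
toy_alignment = {
    "toy_intact": "TAHCADSGSP",   # H, D, S present at columns 2, 5, 8
    "toy_mutated": "TAHCADSGGP",  # Ser replaced by Gly at column 8
}
classes = {name: classify_protease(seq, TRIAD_COLUMNS) for name, seq in toy_alignment.items()}
print(classes)
```

An intact triad is necessary but not sufficient for catalytic activity, so a result of 'SP' here should be read as a prediction that still needs biochemical confirmation.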

Other differentially expressed proteins in 
the haemolymph 

In addition to the serine proteases, other
proteins affected by fungal infection included
pattern recognition receptor molecules, such
as the apolipophorin III (spots 7203 and 7210
and CL123Contig1 in E. intermedius database)
and the imaginal disc derived growth factor.
Apolipophorin III and the imaginal disc
derived growth factor protein spots are
expressed at low levels in untreated beetles
and elevated after fungal challenge. The
Gram-negative-binding protein and the
spaetzle processing enzyme were detected by
manually searching the E. intermedius
database by TBLASTN using the Drosophila
proteins. At least one Toll-like protein was
predicted.

Peptide sequences obtained from spots 0101, 
3004, 4002, and 9003 indicated possible 
matches with Pez when compared against the 
NCBI database. However, these sequences 
matched the odorant binding protein (OBP) 
family when compared to the E. intermedius 
database. The size range of these spots was 
consistent with that of OBPs and was too 
small for Pez. Protein spot 4002 appeared to 
be down-regulated, but statistical analysis
indicated that the difference between protein
expression in challenged and unchallenged
beetles was not significant. Similarly, spots
9003 and 3004 appeared to be up-regulated,
but statistical analysis indicated that the ap- 
parent differential expression was not 
significant. Protein in spot 0101 was, however, 
significantly up-regulated following fungal
infection. The translated E. intermedius
sequence (CL32Contig1) showed perfect
matches with two alternative MS/MS peptide 
fragment sequences that were both identified 
as PBP/GOBP family when compared to the E. 
intermedius database (Figure 6). Interestingly, 
this transcript appeared to have two contigu- 
ous polyadenylation signals similar to those 
reported for human and horse growth
hormones (Masuda et al. 1988; Ascacio-Martinez
and Barrera-Saldana 1994). 

Spots 6410 and 5310 were predicted to be 
transferrins and gelsolin respectively. Other 
differentially expressed genes include the es- 
terases and ATP synthase. Some of these 
proteins are usually associated with insect 
immune response, and others are known com- 
ponents of the haemolymph (Vierstraete et al. 
2003; Karlsson et al. 2004). 

Discussion 

Extracellular signal amplification and
transduction during insect immune response is
characterized by three categories of molecules 
that transduce the signal into the cell: the pat- 
tern recognition receptors, the SPs, and 
transmembrane receptors. Some components 
of the known extracellular signaling mole- 
cules were evident in E. intermedius based on 
proteomic analysis and the study of the adult 
transcriptome. 

Pattern recognition receptors 

In insects, fungal pathogens are recognized by 
constitutively active and inducible pattern 
recognition receptors called GNBPs. Signals 
are transduced via an SP cascade in the extra- 
cellular space and across the membrane via 
the Toll receptor. In this study, GNBPs were 
not identified by proteomic analysis but were 
predicted in the adult transcriptome of E.
intermedius during manual searches of the
database using sequences of Drosophila
homologues. GNBPs and the related
β-1,3-glucan-recognition proteins have been reported in
several coleopterans, such as Tribolium.
Evolutionarily, β-1,3-glucan-recognition proteins
are related to GNBPs and recognize bacterial
and fungal cell wall components (Zou et al.
2007). Two inducible GNBP transcripts were
identified in the burying beetle,
Nicrophorus vespilloides, by suppression
subtractive hybridization. These GNBPs
cluster together with other coleopteran GNBPs
from Tribolium (accession code
NP_001164284) and Tenebrio (accession
code BAG14263.1) with high bootstrap values
(Vogel et al. 2011; Vogel et al. 2011). 

Apolipophorin III was initially believed to be
simply a subunit of the lipid transport protein in
the hemolymph (Pennington and Wells 2002).
Later it emerged that apolipophorin III is
important for insect immunity, and that it acts as
a pattern recognition receptor that responds to
β-1,3-glucan pathogen-associated molecular
patterns found in fungi (Whitten et al. 2004;
Rizwan-ul-Haq et al. 2011). Apolipophorin III
has been shown to bind to bacterial and fungal
cell wall components (Leon et al. 2006a, b)
and is released into the haemolymph of insect
larvae treated with lipopolysaccharide (Vogel
et al. 2011). Here, the predicted E. intermedius
apolipophorin III protein spots (molecular
weight range 22-25 kDa) were significantly
induced by fungal infection (Figure 2, Table
1). The precise effect of the interaction of
apolipophorin III with pathogens is not well
understood. It is reported, however, that
apolipophorin III interacts directly
with pathogens (Zdybicka-Barabas and
Cytryńska 2011). The imaginal disc derived
growth factor, also up-regulated by fungal
infection (spot 7301), is a putative recognition
protein with a chitin-binding lectin domain
(Levy et al. 2004).

Serine proteases 

Invertebrates respond to bacterial or fungal
infection by the activation of three pathways:
the Toll, the immune deficiency, and the
proPO pathways. The Toll pathway is activated
by bacterial and fungal virulence factors via
pattern recognition receptors such as GNBP3
and peptidoglycan recognition proteins.
Alternatively, the Toll pathway is activated by
"danger signals" via a rather poorly-defined
pathway characterized by an SP cascade
involving Psh (Chamy et al. 2008). Psh is also
implicated in the proPO pathway, as the Manduca
sexta orthologous protein HP6 was
found to induce proPO activation in plasma
(An et al. 2009). These are distinct pathways
at the extracellular level, but less so
intracellularly, where signals are transduced via the
NF-κB and Relish transcription factors, leading
to expression of antimicrobial peptides. The
activation of the immune response by Gram-positive
bacteria and fungi is mediated by cascades
of SPs and results in the activation of
spaetzle, a cytokine-like molecule that acts as
the ligand for the transmembrane receptor Toll.
Many of these proteases are synthesized as
inactive precursor zymogens, which are
cleaved by limited proteolysis to generate the
downstream effector molecules. The proPO
pathway leading to melanization is activated
by peptidoglycan, β-1,3-glucan, and
lipopolysaccharide via a protease cascade whose
components are currently under investigation
by many researchers (Lee et al. 1998; Jiang
and Kanost 2000; Jiang et al. 2005; An et al.
2009). In this study, no clear evidence of
molecules in this pathway was observed.

Activation by the major fungal cell wall
component, β-1,3-glucan, is mediated via GNBP3,
followed by a protease cascade that culminates
in the activation of spaetzle. A detailed
study of this pathway in a coleopteran has
been conducted on the larvae of Tenebrio
molitor (Roh et al. 2009), providing evidence
for the existence of the Toll pathway activated
by virulence factors. Earlier, the activation of
the Drosophila Toll pathway by fungal and
bacterial proteases (or so-called danger signals)
was shown to be mediated by a parallel
proteolytic cascade involving Psh (Chamy et al. 2008).
The mechanism by which Psh is activated by
fungal proteases and the identity of its
substrate have so far not been clearly described.
This pathway is known to be controlled by a
serpin known as Necrotic, and all necrotic
phenotypes are dependent on Psh, indicating a
direct relationship between the two molecules
(Ligoxygakis et al. 2002).

A clear homologue of Psh has not been
identified in coleopterans. Based on the
phylogenetic tree (Figure 4B), the E. intermedius protein
encoded by CL23Contig1 may be the candidate
orthologue of Psh. Protein spots 2203 and
3304, whose intensity increases significantly
after fungal infection, yielded several
fragments that matched parts of the sequences of
proteins encoded by CL8Contig1 and
CL20Contig1. It is noteworthy that a putative
Necrotic transcript (CL111Contig1) was
detected in the E. intermedius database. Necrotic
has been identified as a protease inhibitor that
directly regulates the action of Psh. It has been
reported that nec phenotypes are pleiotropic,
including constitutive activation of spaetzle
and spontaneous melanization (Ligoxygakis et
al. 2002). This reinforces the possible
involvement of Psh in both the Toll and proPO
pathways. In T. molitor, three serpins that
negatively regulate the Toll and proPO
pathways were identified. These serpins are
specific for each of the three apical SPs in the
Toll pathway, and at least two of them are
inducible by Lys-type peptidoglycan and
β-1,3-glucan (Jiang et al. 2009).

Conceivably, the identity of these putative Psh
transcripts can be confirmed by a combination
of biochemical and genetic experiments
involving psh and nec mutants. In Tribolium, 30
SPs and 18 SPHs were predicted, and a putative
Psh orthologue named Tc-Sp66 was
identified on the basis of induction by
Gram-positive bacteria and fungi (Zou et al. 2007).
One orthologue was predicted to be Psh based
on microarray evidence showing that its
expression is induced only after fungal and
Gram-positive bacterial challenge.

This study provides fair evidence for the
presence of putative SPs that are up-regulated
upon fungal infection. Better identification of
these SPs requires cloning of the full-length
transcripts as well as genetic and structural
analyses of the genes and their respective
proteins.

Other proteins 

In addition to the identification of peptides
that match the putative pattern recognition
receptors, trypsin-like SPs, and Toll-like
proteins, other proteins known to be involved in
various aspects of host defense were putatively
identified. These include transferrins,
gelsolin, the OBP, and at least one protein
tyrosine phosphatase. Transferrins are
iron-binding proteins that regulate extracellular
iron levels by sequestering iron atoms, thus
preventing oxidative stress. They are
particularly abundant in hematophagous insects such
as tsetse flies (Strickler-Dinglasan et al.
2006). Transferrin is often up-regulated after
microbial infection and is part of the insect
immune response, probably functioning as an
antibiotic agent against pathogens (Lee et al.
2006).

Gelsolin is an actin-binding protein and is
therefore a structural component of the
cytoskeleton. It is found in two splice forms,
cytoplasmic and secreted, in invertebrates
such as Drosophila and is involved in
cytoskeletal organization and biogenesis (Stella et
al. 1994). It is involved in amyloid formation
in vertebrates and haemolymph clotting in
insects (Karlsson et al. 2004).

Insect OBPs are carrier proteins that transport
small hydrophobic molecules
through the haemolymph to receptors located
in different tissues. They are small proteins
(12-14 kDa) encoded by a divergent gene
family. A comprehensive study of the
diversity of these proteins and their similarities
among insects, including the coleopteran T.
castaneum, showed the
emergence of subfamilies that probably arose
through gene duplications (Gong et al. 2009).
The existence of different isoforms that arose
through gene duplications has been
demonstrated in T. molitor (Graham et al. 2003). In
Tribolium, remarkable expansion of odorant
receptors was noted (Consortium 2008).
Although OBP sequences differ significantly,
they share high structural similarity,
characterized by four conserved cysteines, which
contribute to two disulfide bonds, a β-barrel,
and two α-helices (Rothemund et al. 1999;
Graham et al. 2001; Graham and Davies
2002). There is some evidence of OBP
involvement in insect immunity, but this is
poorly defined (Levy et al. 2004). OBPs have
been identified in normal haemolymph of T.
molitor with no clear function except their
potential to transport small hydrophobic
molecules to tissues (Graham et al. 2003). In
Drosophila, an OBP was found to be
up-regulated upon fungal infection and repressed
by bacterial infection, suggesting an important
role for these proteins in the immune response (Levy et al.
2004). Three protein spots were predicted to
be differentially expressed OBPs, but only one
of these was up-regulated upon fungal
infection. The presence of multiple spots predicted
to be OBPs suggests the existence of different
forms of the protein. The significance of the
contiguous polyadenylation signals is currently
not clear.
Table 1. Up-regulated and down-regulated proteins.

Spot | NCBI predicted protein | Accession number | DB ID | Ion precursor | Peptide | Matching peptide | Prediction | Relative spot density
0101 | Similar to Pez CG9493-PA | XP_392377.2 | CL32Contig1 | 673.91 | LDALGQECLK | IDAIGQECLK | PBP/GOBP family | Uninfected 0.01 < [illegible]; Infected 0.01 < 0.193 > 0.01; p = 0.013, F = 0.3535
2203F | AIc2 trypsinogen | AAF71315.1 | CL8Contig1 | 1036.10 | SNAACAAAYG | SNAACAAAYG | Trypsin-like serine protease | Uninfected 0.012 < 0.044 > 0.012
2203F | [illegible] | - | CL8Contig1 | 760.49 | PAYPGVYSSV | PAYPGVYSSV | Similar to Trypsin 29F | Infected 0.01 < 0.13 > 0.01
2203F | - | - | - | 578.83 | ADQSSFTLK | - | unidentified | -
2203F | - | - | CL18Contig1 | 560.34 | TSYPGNLPDR | TSYPGNIPDR | trypsin-like serine | -
2203F | - | - | CL20Contig1 | 890.58 | FTSPLQYTDTLQLPR | - | similar to trypsin- | p = 0.0054, F = 0.4388
2203U | - | EAT41488.1 | 006991_1682_0418_c_s | 763.43 | VSSAGCASGA | VSSAGCASGA | trypsin-like serine protease | -
2203U | - | - | CL20Contig1 | - | - | - | - | -
2203U | - | - | CL32Contig1 | - | TGGMNADGTFNK | TGGMNADGTFNK | PBP/GOBP family | -
2203U | - | - | - | - | LADVRAHL | - | unidentified | -
3004 | Similar to Pez CG9493-PA | XP_392377.2 | CL32Contig1 | 673.91 | LDALGQECLK | IDAIGQECLK | PBP/GOBP family | Uninfected 0.103 < [illegible]; Infected 0.31 < 0.77 > 0.31; p = 0.1481, F = 0.1020
3304 | Serine protease | EAT40512.1 | - | 664.40 | VVLTAAHAVLQ | - | - | Uninfected 0.015 < 0.026 > 0.013
3304 | - | - | 009895_1663_3195_c_s | 548.81 | FSLCAQGEQK | FNLCAQG | unidentified | Infected 0.03 < 0.136 > 0.03
3304 | - | - | CL339Contig1 | 733.96 | TQGSPLVCLP | TPLICLP | similar to CG17633 (peptidase) | p = 0.048, F = 0.2865
3403 | Chaoptin | EAT39239.1 | no hit | 626.86 | YPTNALPSFDK | - | - | Uninfected 0.08 < [illegible] > 0.08
3403 | Similar to xmas-2 CG32562-PA | XP_[illegible] | no hit | 986.02 | I.CAF.RTSATQ | - | - | -
3403 | - | - | - | 536.82 | SAASG | - | unidentified | Infected 0.003 < 0.003 > 0.003
3403 | - | - | - | 687.13 | SLSEPESCLT | - | unidentified | -
3403 | - | - | - | 626.86 | YPTNALPSFDK | - | unidentified | p = 0.0094, F = 0.0017
3403 | Similar to CG7509-PA (with leucine-rich repeats) | XP_975320.1 | - | 699.37 | NGEYLNFG | - | unidentified | -
4001 | ENSANGP00000006676 (putative glycoprotein hormone rk-like receptor) with leucine-rich repeats | XP_317111.1 | CL673Contig1* | 665.65 | EASTLAEFR | - | Similar to Toll 6 | Uninfected 0 < 0 > 0
4001 | Similar to Pez CG9493-PA | XP_392377.2 | CL32Contig1 | 673.88 | LDALGQECLK | IDAIGQECLK | PBP/GOBP family | Infected 0.03 < 0.91 > 0.03; p = 0.001
4002U | GA19474-PA | EAL30835.1 | CL32Contig1 | 737.95 | QEALDALGQECLK | IDAIGQECLK | PBP/GOBP family | Uninfected 0.46 < 3.26 > 0.46; Infected 0.37 < 2.03 > 0.37; p = 0.1012, F = 0.2794
5102F | CAD | AAQ67196.1 | - | 669.83 2+ | PAETAYLYQK | - | - | Uninfected 0.07 < 0.453 > 0.07
5102F | ENSANGP00000001829 | XP_311192.2 | - | 630.30 2+ | QFDDGGDQULK | - | - | Infected 0.094 < 0.736 > 0.094; p = 0.0835, F = 0.4337
5310 | Gelsolin precursor | EAT42358.1 | 014446_1673_2874_c_s* | 768.90 | YAPGGVASGFNQVNR | - | Gelsolin | Uninfected 0.01 < 0.14 > 0.01; Infected 0.1 < 0.54 > 0.1; p = 0.0123, F = 0.0213
5312 | Gelsolin precursor | EAT42358.1 | 014446_1673_2874_c_s* | 768.89 | YAPGGVASGFNQVNR | - | Gelsolin | Uninfected 0.07 < 1.24 > 0.07; Infected 0.05 < 0.03 > 0.05; p = 0.003, F = 0.3389
5406 | ENSANGP00000002901 | EAA07673.3 | - | 905.46 | NSPSEPLNVGLLYR | - | - | Uninfected 0.56 < 1.19 > 0.56; Infected 0 < 0 > 0; p = 0.0001

(*) indicates sequences that were not identified directly from searches with the short peptides but indirectly by using the predicted
NCBI orthologues to search the Euoniticellus database. The graphs of the relative spot density values represent the means of three
biological replicates. The error bars indicate the standard error of these means. A two-tailed Student's t-test was performed at the 95%
confidence level to establish whether there was a significant difference in the levels of protein expression between treated and
untreated beetles, as represented by their optical density. p-values less than 0.05 demonstrate a significant difference between the
challenged and unchallenged groups for each spot. F-values higher than 0.05 indicate that the variance between challenged and
unchallenged groups is not equal.
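
As an illustration of the test described in the footnote, the following is a minimal sketch of a two-tailed Student's t-test (pooled variance) for three replicates per group. The density values are invented for the example and are not taken from Table 1. Rather than computing a p-value, the sketch compares the statistic with the two-tailed critical value for 4 degrees of freedom at the 0.05 level (2.776), which is an equivalent decision rule.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical t value for alpha = 0.05 and df = 3 + 3 - 2 = 4.
T_CRIT_DF4 = 2.776


def student_t(group_a, group_b):
    """Student's t statistic with pooled variance (equal variances assumed)."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical relative spot densities for one spot (three biological replicates each).
uninfected = [0.03, 0.05, 0.04]
infected = [0.20, 0.30, 0.25]

t_stat = student_t(infected, uninfected)
significant = abs(t_stat) > T_CRIT_DF4
print(f"t = {t_stat:.2f}, significant at 0.05: {significant}")
```

The critical value is hard-coded only because both groups have three replicates; other group sizes would need the value for the corresponding degrees of freedom.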
Table 1 continued.

Spot | NCBI predicted protein | Accession number | DB ID | Ion precursor | Peptide | Matching peptide | Prediction | Relative spot density
6005 | Hypothetical protein | EAT45043.1 | - | 646.85 | LSQFDDDVLLK | - | - | Uninfected 0 < 0 > 0; Infected 0.12 < 0.35 > 0.12; p = 0.0001
[illegible] | - | [illegible] | [illegible] | - | [illegible] | [illegible] | Cytochrome P450 | Uninfected 0.75 < 1 > 0.75
[illegible] | Transient receptor potential channel 4 | EAT4295.1 | - | 596.83 | VGAESFDLAGVK | - | - | Infected 0.48 < 0.95 > 0.48; p = 0.995, F = 0.2873
6304 | GA18404-PA | EAL33860.1 | 011178_1666_3502_c_s | 557.32 2+ | NQLLQLEQK | NQLLQLEQK | Cytochrome P450 | Uninfected 0.26 < 0.5 > 0.26
6304 | CG10639-PA | NP_609923.2 | - | 715.88 2+ | ADLDGNLLEDK | - | - | Infected 0.3 < 0.4 > 0.3; p = 0.776, F = 0.-1353
6410 | Similar to transferrin 2 CG10620-PA | XP_396618.1 | - | 704.85 | LVEGDGDVAFVK | - | - | Uninfected 0.03 < 0.58 > 0.03
6410 | Transferrin | ABI31834.1 | 012952_1546_3581_c_s | 562.79 | irVTVPGMTDGK | IGIPGNPDGR | similar to CG31997 | -
6410 | Transferrin | ABI31834.1 | - | 618.31 | TQEEPEAEFR | - | - | Infected 0.08 < 0.94 > 0.08
6410 | Transferrin | ABI31834.1 | 008196_1603_1902_c_s | 777.94 | VLELSENNVAKPNK | ELSII>WVAFP | eIF-4a | -
6410 | Transferrin | ABI31834.1 | 012044_1541_2861_c_s* | 713.39 | AGYNAPT.YTI.VK | - | Similar to transferrin | p = 0.0365, F = 0.0882
7106 | Putative carbohydrate binding glycosyltransferase | ZP_01357938.1 | CL351Contig1 | 981.98 2+ | SLLDSYPTAYTNVK | DTYPTAY | unidentified | Uninfected 0.02 < 0.04 > 0.02; Infected 0.08 < 0.27 > 0.08; p = 0.0475, F = 0.8
7203 | - | - | CL123Contig1 | 595.33 | RAESFDQLVK | AESFDQLVK | Apolipophorin III | Uninfected 0.04 < 0.78 > 0.04; Infected 0.35 < 2.6 > 0.35; p = 0.0067, F = 0.011
7210 | ENSANGP00000021859 | XP_313925.1 | CL123Contig1 | 532.78 | AGENVQAFTK | AGENVQAFTK | - | Uninfected 0.05 < 0.06 > 0.05
7210 | GA18404-PA | EAL33860.1 | 011178_1666_3502_c_s | 557.33 | NQLLQLEQK | NQLLQLEQK | Apolipophorin III | Infected 0.061 < 0.382 > 0.061; p = 0.0167, F = 0.4043
7301F | [illegible] | [illegible] | - | 713.39 2+ | - | - | - | Uninfected 0.22 < 0.323 > 0.22
7301F | Oligopeptidase | EAT48748.1 | 012961_1674_0476_c_s | 836.91 2+ | NYPAPLYSL | - | similar to chitinase 5 | Infected 0.01 < 0.95 > 0.01
7301U | IDGF4 CG1780-PA | XP_396769.2 | 015002_1611_0756_c_s | 725.94 2+ | TPGLLSYPEV | IIYPLPLYSV | - | p = 0.1045, F = 0.0021
7303 | Similar to general receptor for phosphoinositides 1 CG11628-PA | XP_001123194.1 | - | 620.33 2+ | IIWLETKPSNK | - | - | Uninfected 0.0294 < 0.0486 > 0.0294
7303 | IDGF4 CG1780-PA | XP_396769.2 | - | 725.93 2+ | TPGLLSYPEV | - | - | Infected 0.044 < 0.286 > 0.044; p = 0.0149, F = 0.3044
8306 | Chaoptin | EAT39239.1 | - | 626.87 | YPTNALPSFDK | - | - | Uninfected 0.21 < 0.573 > 0.21; Infected 0.0133 < 0.0133 > 0.0133; p = 0.05, F = 0.004
8402 | Similar to CG13886-PA | XP_394035.2 | - | 930.08 | LLTQQAECL | - | - | Uninfected 0.17 < 0.34 > 0.17; Infected 0.294 < 0.586 > 0.294; p = 0.5147, F = 0.2498
9003 | Similar to Pez CG9493-PA | XP_392377.2 | CL32Contig1 | 673.88 | LDALGQECLK | IDAIGQECLK | PBP/GOBP family | Uninfected 0.082 < 0.113 > 0.082
9003 | Conserved hypothetical protein | EAT39907.1 | - | 550.34 | VSSATMSR | - | - | Infected 0.035 < 0.323 > 0.035; p = 0.1506, F = 0.3361

Table 2. Comparison of the dung beetle database with those of other invertebrates. The total number of unigenes in the database was
2662.

Database | Number of unigenes | %
Acyrthosiphon pisum | 899 | 34
Apis mellifera | 998 | 38
Bombyx mori | 951 | 36
Drosophila melanogaster | 995 | 37
Tribolium castaneum | 1297 | 49
All databases | 1334 | 50
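
The percentages in Table 2 are presumably the number of matching unigenes divided by the 2662 unigenes in the database. A short sketch of that arithmetic, using the counts from the table (the variable names are illustrative), is given below; small differences from the printed whole-number percentages may reflect rounding or transcription.

```python
TOTAL_UNIGENES = 2662  # total stated in the Table 2 caption

# Unigene counts per comparison database, as listed in Table 2.
hits = {
    "Acyrthosiphon pisum": 899,
    "Apis mellifera": 998,
    "Bombyx mori": 951,
    "Drosophila melanogaster": 995,
    "Tribolium castaneum": 1297,
    "All databases": 1334,
}

# Percentage of the dung beetle unigenes with a match in each database.
percent = {name: 100 * count / TOTAL_UNIGENES for name, count in hits.items()}

for name, value in percent.items():
    print(f"{name}: {value:.1f}%")
```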
Table 3. Predicted immune genes found in the Euoniticellus intermedius transcriptome database.



r lyidu^t^uuiiicUaSc lu 


I nooiturn u^lnulu^uc 




DFOsophila nuclsotids 






003936_1548_1151_c_s 


NA 


Dorsal interacting protein 4 


NA 


Ubiqui tin-conjugating 
enzyme H2 N 


CG7656-PC 


004657_1617_2913_c_s 


NA 


Dorsal interacting protein 4 


NA 


U biquitin-conjugating 
enzyme E2 N 


UbcD6-PA 




MA 


K^s~rcldtcG protein 


MA 


MA 


RVinl -PA 


006324_I633_n830_c_s 


NA 


Defense protein l(2)34Fc 


NA 


NA 


NA 


UUUJ't-J 1 UUO jUOO u s 


NA 




NA 


NA 




007008 1624 1394 c s 


Toll-like 


NA 


NA 


NA 


NA 




NA 


AjlS'IClalCU piUlCUl Ixal~a 


NA 


NA 




007472_1671_2184_c_s 


NA 


Tyrosine-protein kinase 


hop 


NA 


na 


007472_1671_2184_c_s 


NA 


lyrubiiic-pruLcin Kuiuac 
hopscotch 


Takl 


NA 


na 




MA 


MA 


inor 


MA 


MA 


008655 1605 1151 cs 


NA 


L/ciumicu cpiuci llial 

aULlJl CXUlaUil V laullll 1 


NA 


NA 


Deafl-PB 


008815_1595_2123_c_s 


NA 


Tyrosine-protein kinase 


NA 


Ribosomal protein S6 


Src42A-PA 


009315_I547_2493_c_s 


NA 


Mitogen-activated protein 


p38b 


Cyclin-dependent kinase 1 


Cdk9-PA 


009933 1651 3213 c s 


NA 


Dual oxidase 


NA 


NA 


Pxd-PA 


m CiftdT. iss? 'xo'X'i r c 


MA 


Ras-related protein Ral-a 


IMA 


MA 




010966 1592 1457 c s 


NA 


NA 


dnrl 


NA 


NA 




MA 


MA 


ncc 


MA 


opnt-x \j 




MA 


Till /^V 1/1 4 C A 


MA 


MA 




CL147Contigl 


NA 


^4itogen~activ3tcd 
phosphatase 3, 


NA 


Dua.1 specificity protein 
phosphatase 3 


Mkp-PB 


CLlSContigl 


NA 


opuiiixi ui opiuiuLZ ur 
persephone 


NA 


subcomponent-like protein 


CG5255-PA 


CL20Contigl 


NA 


Sphinx 1 or persephone 


NA 


^^umpicmeiii v^ir 
DUucumpunciiL-iiKc pruLcin 


Jon65Aiv-PA 


CL23Contigl 


NA 


Sphinx 1 or persephone 


NA 


Complement Clr 
subcomponent-like protein 


CG10472-PA 




MA 


Protein sp&etzle 


MA 


MA 


MA 


CL426Contigl 


Toll-7-like protein 


Toll-6orToll-9 


ISw 


NA 


NA 


CL444Contigl 


NA 


Protein croquemort 


crq 


NA 


NA 




MA 


*tUi3 llUUaUlIlaX piUlCIII iJ\J 




MA 


Rn^/^-PR 
ivp>jU-rD 


CL515Contigl 


NA 


hihibitor of nuclear factor 

&dUpa-D KJIloaC sUUUIlll 


Mpk2 


Cychn-dependent kinase 1 


NA 


CL515Contigl 


NA 


Mekkl, isoform B 


Mpk2 


Cyclin-dependent kinase 1 


NA 




Toll-6 


1 8 u/lippIpr/Pmtpin Tnll 


18w 


Toll-like receptor 3 


Als-PA 


CLRContigl 


NA 


Persephone/ Sphinx 1 
/Sphinx2 


NA 


Complement Clr 
subcomponent-like protein 


Try29F-PA 


Manually identified genes 
007956 1645 0789 c s 




Gram-negative bacteria 
binding protein 3 








CL47Contigl 




Spaetzle processing 
enzyme 






CL652Contigl 




Gram-negative bacteria 
binding protein 1 
[Figure 1 image labels: Maternal gift; scale bar, 3 mm]

Figure 1. The life cycle of Euoniticellus intermedius. A. Developmental stages from embryo to adult. B. An embryo in a brood ball
showing the maternal gift. High quality figures are available online.
Figure 2. 2-D gel electrophoresis of Euoniticellus intermedius hemolymph proteins. The gel was stained with Coomassie blue. It shows
protein spots in unchallenged (A) and in infected beetles (B). Proteins found in treated beetles only, in untreated beetles only, and in
both treated and untreated beetles are shown. C. Optical density measurements (N = 3) of protein spots. High quality figures are
available online.
Figure 3. Gene Ontology analysis of Euoniticellus intermedius genes. Of the total of 2662 unigenes, 933 were functionally annotated
from BLAST results against Drosophila melanogaster. A gene ontology study was then conducted on the 933 unigenes, made up of 338 contigs and
595 singletons. The D. melanogaster gene ontology association file was used to annotate the unigenes with Gene Ontology terms, and
results were stored in a local MySQL database. The gene ontology analysis is implemented by one of the modules of the EST2uni pipeline
(Forment et al. 2008). The retrieval of all 2662 unigenes, left-joined by gene ontology terms, resulted in a WEGO (Web Gene Ontology)
annotation plot. The Native Format file and WEGO were used for plotting GO annotation results as a histogram. The gene ontology
terms have 1 to 7 levels of detail, 1 being most general and 7 most detailed/specific. The histogram shown here is of GO terms at level
2. High quality figures are available online.
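
The left join mentioned in the Figure 3 caption keeps every unigene in the result, whether or not it received a GO term. The sketch below illustrates the idea with an in-memory SQLite database instead of the MySQL database used in the study. The table names, column names, and the GO assignments are hypothetical and serve only to show the query shape.

```python
import sqlite3

# Hypothetical schema; the real EST2uni tables are not described in the text.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE unigene (id TEXT PRIMARY KEY);
    CREATE TABLE go_annotation (unigene_id TEXT, go_term TEXT);
    """
)
conn.executemany(
    "INSERT INTO unigene VALUES (?)",
    [("CL32Contig1",), ("CL8Contig1",), ("CL673Contig1",)],
)
# Invented annotations for illustration (GO:0003824 catalytic activity, GO:0005488 binding).
conn.executemany(
    "INSERT INTO go_annotation VALUES (?, ?)",
    [("CL8Contig1", "GO:0003824"), ("CL673Contig1", "GO:0005488")],
)

# LEFT JOIN: unannotated unigenes are retained with a NULL (None) GO term.
rows = conn.execute(
    """
    SELECT u.id, g.go_term
    FROM unigene AS u
    LEFT JOIN go_annotation AS g ON g.unigene_id = u.id
    ORDER BY u.id
    """
).fetchall()

for unigene_id, go_term in rows:
    print(unigene_id, go_term)
```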
[Figure 4 image. Sequences shown: Persephone (394 aa), Holotrichia PPAF (415 aa), CL23Contig1 (207 aa), CL20Contig1 (249 aa),
sphinx1 (253 aa), sphinx2 (253 aa), CL8Contig1 (250 aa). Bootstrap values legible on the tree in panel B: 91, 59, 82, 100.]

Figure 4. Sequence alignment of known proteases with predicted database polypeptides. A. Drosophila Persephone, sphinx proteins, and
Holotrichia diomphalia PPAF were aligned with putative Euoniticellus intermedius SP domain proteins. The alignment was performed using
the Clustal W program. The position of the catalytic triad is marked with asterisks. The triad is conserved in all three E. intermedius
proteins. Peptide sequences obtained from the mass spectrometric analysis of the protein spots 2203 (single line) and 3304 (three lines) are
aligned with one of the three E. intermedius sequences as shown. B. Phylogenetic tree of the abovementioned proteins. The tree was
constructed using the neighbour joining method and the MEGA 5 software with 500 bootstrap replications. High quality figures are
available online.
[Figure 5 image. Panels: Drosophila and Tribolium. Diagram levels: pathogen associated molecular patterns (PAMPs) and danger signals,
extracellular recognition factors (GNBP3, GNBP1, PGRP-SA, PGRP-SD), serine protease cascade, serpin, transmembrane receptor,
intracellular signalling, effectors (antimicrobial peptides). E. intermedius sequences labelled in the diagram, in the order in which they
appear: GNBP3 (007936_1645_0789_c_s), Persephone (CL20Contig1), Nec-1 (CL111Contig1), Grass (CL47Contig1),
Sphinx/Spirit/Spheroide (CL8Contig1), Spaetzle processing enzyme (CL47Contig1), Spz (CL283Contig1), Toll (CL673Contig1).]

Figure 5. Schematic diagram of extracellular immune signaling in Drosophila and Tribolium. Pathogens are recognized by pattern
recognition receptors (PRRs) based on pathogen associated molecular patterns (PAMPs) present on the pathogen. PAMPs may be Lys-type or
DAP-type peptidoglycans. The pathway can also be activated by endogenous factors produced in live organisms (danger signals or
virulence factors). The Toll pathway is activated when Gram-positive bacteria are sensed by the peptidoglycan recognition proteins
(PGRP-SA). Fungal PAMPs are sensed by the glucan binding protein 3 (GNBP3). These PRRs are secreted proteins, and when they form
complexes with appropriate PAMPs, they initiate the serine protease cascades in the hemolymph, culminating in the cleavage of pro-spaetzle,
converting it to active spaetzle, the endogenous ligand of the Toll receptor. In Drosophila, a parallel protease cascade is activated by
danger signals such as fungal and bacterial proteases and activates the Toll pathway via the trypsin-like serine protease known as Persephone
and spaetzle. Binding of spaetzle to Toll causes a conformational change and subsequent recruitment of intracellular signaling molecules.
Intracellular signal transduction is mediated by a phosphorylation cascade that culminates in the release of NF-κB-like transcription
factors from a complex with cactus. These transcription factors activate expression of genes that encode antimicrobial peptides and
other effectors. High quality figures are available online.



ACCAAAATAGAAATTCTTCGTTGCCTTCGCCCTCTTGGTTGTTGCCGCTTCGGCAATCAC 

KFFVAFALLVVAASAI T 
TCCAGAAAGGAAAGCTGAAATTGACGCCATCGGTCAAGAATGCCTTAAATCCAGCGGCGT 

? E R K A E IDAIGQSCLK S S G V 
TGAAAGACCATTGGTAGCCGATCTTCGCAAAGGTGTCTTCTCCGAAGATGCCAAATTAAA 

ERPLVADLRKGVFSEDAKLK 
GAACTTCGTTAGCTGCGTTTTCGTCAAAACAGGCGGTATGAACGCTGATGGCACATTCAA 

NFVSCVFVK TGGMNADGTFN 
CAAAGACGTTGTCAGGAAGGACTTTGGAGACCGCAAAGAAGTTGTTGACGCTGCTCTTCT 

KDVVRKDFGDRKEVVDAALL 
CTGCACTGACTCCCACGGAGCCACCGTTGACGAAACCGCCTACTTGGCCTACAAATGCTT 

CTDSKGATVDETAYLAYKCF 
CAGAAACAACGTCCCATCTGACTACAAACCAAACTGGTAAACTCTAGACTGCCACACCGA 

RNNVPSDYKPNW* 
TCCATTTCTCTCTTCTA AATAAATAAA 7ATGACTTTTTAAATAAAAAAAAAAAAAAAAAA 
AANAAAAAAAAAAAAAGAAAAAAAAAAGGGGGGGGAA 



Figure 6. Translation of CL32Contig1. Peptide fragment sequences obtained by MS/MS are underlined. Putative overlapping tandem
polyadenylation signals are shown in bold with double underlining. High quality figures are available online.
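
The overlapping tandem polyadenylation signals noted in the caption can be located with a simple motif scan. The sketch below assumes the canonical AATAAA hexamer and uses a regular-expression lookahead so that overlapping occurrences are all reported. The input is a short excerpt from the 3' end of the sequence above, and the function name is illustrative.

```python
import re


def find_overlapping(seq, motif="AATAAA"):
    """Return 0-based start positions of every occurrence, overlaps included."""
    pattern = f"(?={re.escape(motif)})"  # zero-width lookahead allows overlapping hits
    return [m.start() for m in re.finditer(pattern, seq.upper())]


# Excerpt from the 3' untranslated region of CL32Contig1 (Figure 6).
utr_excerpt = "TCTTCTAAATAAATAAA"
positions = find_overlapping(utr_excerpt)
print(positions)  # the two hits start 4 nt apart and share two bases
```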